Fluctuation of the Length of the Longest Common Subsequence
Abstract
Let $X_1, \dots, X_n$ and $Y_1, \dots, Y_n$ be two independent sequences of i.i.d. Bernoulli variables with parameter $\epsilon > 0$. Let $X$ designate the string $X := X_1 X_2 \dots X_n$ and let $Y := Y_1 Y_2 \dots Y_n$. Let $L_n$ designate the length of the longest common subsequence (LCS) of $X$ and $Y$. We prove that for a constant $c > 0$, $\mathrm{VAR}[L_n] > cn$ if $\epsilon > 0$ is taken small enough. Hence, for small $\epsilon$, the order of magnitude of $\mathrm{VAR}[L_n]$ is $\Theta(n)$. For small $\epsilon$, this rejects the Chvatal-Sankoff conjecture in [7] that $\mathrm{VAR}[L_n] = o(n^{2/3})$, and it answers Waterman's question of whether the linear bound on $\mathrm{VAR}[L_n]$ can be improved [14].
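The quantities in the abstract can be explored numerically. The following is a minimal sketch, not the authors' method: it computes the LCS length with the standard dynamic program and estimates the variance of the LCS length by Monte Carlo sampling. The function names, the seed, and the parameter names `n`, `eps`, and `trials` are illustrative assumptions, and a simulation of this kind only suggests a trend; it proves nothing about the asymptotic order.

```python
import random


def lcs_length(x, y):
    """Length of the longest common subsequence of x and y.

    Standard O(len(x) * len(y)) dynamic program, keeping one row at a time.
    """
    prev = [0] * (len(y) + 1)
    for xi in x:
        curr = [0]
        for j, yj in enumerate(y, start=1):
            if xi == yj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def bernoulli_string(n, eps, rng):
    """n i.i.d. Bernoulli(eps) symbols (1 with probability eps, else 0)."""
    return [1 if rng.random() < eps else 0 for _ in range(n)]


def sample_lcs_lengths(n, eps, trials, seed=0):
    """Draw `trials` independent pairs (X, Y) and return their LCS lengths."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    return [
        lcs_length(bernoulli_string(n, eps, rng), bernoulli_string(n, eps, rng))
        for _ in range(trials)
    ]


def sample_variance(values):
    """Unbiased sample variance (requires at least two values)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)
```

One possible use is to call `sample_variance(sample_lcs_lengths(n, eps, trials))` for several values of `n` at a fixed small `eps` and compare the ratio of the estimate to `n` across runs.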
Similar resources
Fluctuations of the longest common subsequence for sequences of independent blocks
The problem of the order of the fluctuation of the Longest Common Subsequence (LCS) of two independent sequences has been open for decades. There exist contradicting conjectures on the topic, [1] and [2]. Lember and Matzinger [3] showed that with i.i.d. binary strings, the standard deviation of the length of the LCS is asymptotically linear in the length of the strings, provided that 0 and 1 ha...
Random modification effect in the size of the fluctuation of the LCS of two sequences of i.i.d. blocks
The problem of the order of the fluctuation of the Longest Common Subsequence (LCS) of two independent sequences has been open for decades. There exist contradicting conjectures on the topic, [1] and [2]. In the present article, we consider a special model of i.i.d. sequences made out of blocks. A block is a contiguous substring consisting only of one type of symbol. Our model allows only three...
An Efficient Dynamic Programming Algorithm for a New Generalized LCS Problem
In this paper, we consider a generalized longest common subsequence problem, in which a constraining sequence of length s must be included as a substring and the other constraining sequence of length t must be included as a subsequence of two main sequences and the length of the result must be maximal. For the two input sequences X and Y of lengths n and m, and the given two constraining sequen...
Generalizations and Variants of the Largest Non-crossing Matching Problem in Random Bipartite Graphs
A two-rowed array $\alpha_n = \begin{pmatrix} a_1 & a_2 & \dots & a_n \\ b_1 & b_2 & \dots & b_n \end{pmatrix}$ is said to be in lexicographic order if $a_k \le a_{k+1}$ and $b_k \le b_{k+1}$ if $a_k = a_{k+1}$. A length $\ell$ (strictly) increasing subsequence of $\alpha_n$ is a set of indices $i_1 < i_2 < \dots < i_\ell$ such that $b_{i_1} < b_{i_2} < \dots < b_{i_\ell}$. We are interested in the statistics of the length of the longest increasing subsequence of $\alpha_n$ chosen according to $D_n$, for distinct fam...
Expected Length of the Longest Common Subsequence for Large Alphabets
We consider the length $L$ of the longest common subsequence of two $n$-character words chosen uniformly and independently at random over a $k$-ary alphabet. Subadditivity arguments yield that $E[L]/n$ converges to a constant $\gamma_k$. We prove a conjecture of Sankoff and Mainville from the early 80's claiming that $\gamma_k \sqrt{k} \to 2$ as $k \to \infty$.